Cross-lingual zero-resource named entity recognition model based on sentence-level generative adversarial network
Xiaoyan ZHANG, Zhengyu DUAN
Journal of Computer Applications    2023, 43 (8): 2406-2411.   DOI: 10.11772/j.issn.1001-9081.2022071124

Low-resource languages lack labeled data, which prevents the use of existing mature deep learning methods for Named Entity Recognition (NER). To address this problem, a cross-lingual NER model based on a sentence-level Generative Adversarial Network (GAN), namely SLGAN-XLM-R (Sentence Level GAN based on XLM-R), was proposed. Firstly, the labeled data of the source language were used to train the NER model on the basis of the pre-trained model XLM-R (XLM-Robustly optimized BERT pretraining approach). At the same time, linguistic adversarial training was performed on the embedding layer of the XLM-R model using the unlabeled data of the target language. Then, soft labels for the unlabeled target-language data were predicted with the NER model. Finally, the labeled source-language data and the soft-labeled target-language data were mixed to fine-tune the model again and obtain the final NER model. Experiments were conducted on four languages (English, German, Spanish, and Dutch) from two datasets, CoNLL2002 and CoNLL2003. The results show that, with English as the source language, the F1 scores of the SLGAN-XLM-R model on the German, Spanish, and Dutch test sets are 72.70%, 79.42%, and 80.03%, respectively, which are 5.38, 5.38, and 3.05 percentage points higher than those obtained by directly fine-tuning the XLM-R model.
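
The soft-labeling and data-mixing steps can be illustrated with a minimal sketch. The abstract does not specify how soft labels are represented or how the two data sources are combined, so everything here is an assumption: soft labels are taken to be per-token softmax distributions over tag logits, source-language gold tags are converted to one-hot distributions, and "mixing" is simple concatenation. The tag set, function names, and example logits are hypothetical, and the adversarial training on the embedding layer is not shown.

```python
import math

# Hypothetical tag set, for illustration only (not taken from the paper).
TAGS = ["O", "B-PER", "I-PER", "B-LOC"]


def softmax(logits):
    """Turn raw tag scores into a probability distribution (a soft label)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(tag):
    """Represent a gold tag as a distribution so both label kinds share a format."""
    return [1.0 if t == tag else 0.0 for t in TAGS]


def cross_entropy(pred_probs, target_probs):
    """Cross-entropy against a target distribution; works for hard and soft targets."""
    return -sum(t * math.log(p) for p, t in zip(pred_probs, target_probs) if t > 0)


def build_mixed_set(source_examples, target_logits):
    """Concatenate hard-labeled source data with soft-labeled target data (assumed mixing rule)."""
    mixed = []
    for tokens, tags in source_examples:
        mixed.append((tokens, [one_hot(t) for t in tags]))
    for tokens, logits in target_logits:
        mixed.append((tokens, [softmax(row) for row in logits]))
    return mixed


# Source-language sentence with gold tags (made-up example).
source = [(["Alice", "left"], ["B-PER", "O"])]
# Target-language sentence with made-up per-token logits from the first-stage model.
target = [(["Berlin", "ist"], [[0.1, 0.2, 0.0, 2.5], [3.0, 0.1, 0.1, 0.2]])]

mixed = build_mixed_set(source, target)
```

Representing both label types as distributions lets a single cross-entropy loss cover the mixed set during the second fine-tuning pass; whether the authors use this exact formulation is not stated in the abstract.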
